@luancazarine luancazarine commented Sep 29, 2025

Resolves #16762

Summary by CodeRabbit

  • New Features

    • Search creators across supported platforms with pagination and optional limits.
    • Fetch a creator’s profile by platform and handle or URL.
    • New "New Profile Update" source that polls profiles, computes diffs, and emits change events with standardized payloads and summaries.
    • Expanded platform options for searches and profile lookups; improved polling and pagination support.
  • Chores

    • Version bumped to 0.1.0 and dependency on @pipedream/platform added.

…ns for fetching creator profiles and searching creators, and introduce constants and utility functions for improved functionality.
@luancazarine luancazarine linked an issue Sep 29, 2025 that may be closed by this pull request

vercel bot commented Sep 29, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

2 Skipped Deployments
Project Deployment Preview Comments Updated (UTC)
pipedream-docs Ignored Ignored Sep 29, 2025 3:23pm
pipedream-docs-redirect-do-not-edit Ignored Ignored Sep 29, 2025 3:23pm

coderabbitai bot commented Sep 29, 2025

Walkthrough

Adds a scrapecreators app with HTTP helpers, platform constants, pagination, and a deep-object diff utility. Introduces search and fetch actions, a reusable polling base, a New Profile Update source that diffs profiles and emits change events, and bumps package version and dependencies.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Actions: Search & Fetch | components/scrapecreators/actions/search-creators/search-creators.mjs, components/scrapecreators/actions/fetch-creator-profile/fetch-creator-profile.mjs | New actions: search creators with pagination and optional limit; fetch a creator profile with platform-aware params, summary export, and error/partial-success handling. |
| App Integration & HTTP | components/scrapecreators/scrapecreators.app.mjs | New app module with axios-based request helper, propDefinitions, searchCreators, fetchCreatorProfile, parseParams/parsePath, and an async paginate generator. |
| Common: Constants | components/scrapecreators/common/constants.mjs | New platform groupings exported: PATH_PLATFORMS, URL_PLATFORMS, SEARCH_PLATFORMS, HANDLE_PLATFORMS, PLATFORMS. |
| Common: Utils | components/scrapecreators/common/utils.mjs | New exported getObjectDiff(obj1, obj2) that computes deep diffs (added/modified/deleted, nested changes). |
| Sources: Base Polling | components/scrapecreators/sources/common/base.mjs | New reusable polling source with lastId tracking, deploy preload, emitEvent logic, and standardized event payloads. |
| Source: New Profile Update | components/scrapecreators/sources/new-profile-update/new-profile-update.mjs, components/scrapecreators/sources/new-profile-update/test-event.mjs | New source that polls fetchCreatorProfile, diffs against stored profile via getObjectDiff, emits events on changes, and includes a test event payload. |
| Package Metadata | components/scrapecreators/package.json | Version bumped to 0.1.0 and dependencies added (@pipedream/platform ^3.1.0). |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant Action as Action: Fetch Creator Profile
  participant App as App: scrapecreators
  participant API as ScrapeCreators API

  User->>Action: Run with platform, profileId
  Action->>App: fetchCreatorProfile({ platform, profileId })
  App->>App: parsePath(platform), parseParams(platform, profileId)
  App->>API: GET /{path} with params/headers
  API-->>App: Response (profile or message)
  App-->>Action: Profile data
  Action-->>User: Summary + result or message
  Note over Action,App: Error path throws unless response.data?.success is true
sequenceDiagram
  autonumber
  actor User
  participant Action as Action: Search Creators
  participant App as App: scrapecreators
  participant API as ScrapeCreators API

  User->>Action: Run with platform, query, limit?
  Action->>App: paginate({ fn: searchCreators, params, platform, maxResults: limit })
  loop Pagination
    App->>API: GET /search with cursor/params
    API-->>App: Items + next cursor
    App-->>Action: yield items
  end
  Action-->>User: Collected items + summary
sequenceDiagram
  autonumber
  participant Timer as Timer
  participant Source as Source: New Profile Update
  participant App as App: scrapecreators
  participant DB as DB
  participant PD as Pipedream

  Timer->>Source: run/deploy
  Source->>DB: read lastProfile
  Source->>App: fetchCreatorProfile({ platform, profileId })
  App-->>Source: currentProfile
  Source->>Source: getObjectDiff(lastProfile, currentProfile)
  alt Diff not empty
    Source->>PD: $emit({ profile, diff }, meta)
    Source->>DB: write currentProfile
  else No changes
    Source-->>Timer: noop
  end
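
The polling flow in the last diagram can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's source: `pollProfile`, the Map-backed `db`, the `fetchProfile` and `emit` callbacks, and the shallow `shallowDiff` stand-in for getObjectDiff are all hypothetical, and the first-run behavior (store a baseline, emit nothing) is a guess the diagram does not confirm.

```javascript
// Hypothetical stand-in for getObjectDiff: compares own top-level keys only.
function shallowDiff(prev, next) {
  const diff = {};
  const keys = new Set([...Object.keys(prev), ...Object.keys(next)]);
  for (const key of keys) {
    if (prev[key] !== next[key]) {
      diff[key] = { oldValue: prev[key], newValue: next[key] };
    }
  }
  return diff;
}

// One polling cycle: read the stored snapshot, fetch the current profile,
// emit only when something changed, then persist the new snapshot.
async function pollProfile({ db, fetchProfile, emit }) {
  const lastProfile = db.get("lastProfile");
  const currentProfile = await fetchProfile();

  // Assumption: the first run only records a baseline and emits nothing.
  if (lastProfile === undefined) {
    db.set("lastProfile", currentProfile);
    return false;
  }

  const diff = shallowDiff(lastProfile, currentProfile);
  if (Object.keys(diff).length === 0) return false; // "No changes" branch

  emit({ profile: currentProfile, diff }, { ts: Date.now() });
  db.set("lastProfile", currentProfile);
  return true;
}
```

Returning a boolean is only a convenience for testing the sketch; a real source would not need it.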

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

I nibbled code and hopped through streams,
New profiles bloom like data-dreams.
I sniff the diffs—what changed today?
A follower here, a bio sway.
With gentle paws, I page and poll—
Hop, emit! The events roll. 🥕🐇

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings, 1 inconclusive)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Linked Issues Check | ⚠️ Warning | While the PR implements the search-creators and fetch-creator-profile actions and adds the new-profile-update source along with shared utilities and constants, it does not include the add-creator-to-tracking action or the new-creator-instant and new-video-instant webhook sources specified in issue #16762. | Implement the missing add-creator-to-tracking action as well as the new-creator-instant and new-video-instant sources, or clarify if those endpoints are to be delivered in a separate PR. |
| Description Check | ⚠️ Warning | The pull request description only contains “Resolves #16762” and does not follow the repository’s description template, which requires a “## WHY” section and context explaining the purpose and motivation of the changes. | Please complete the description using the provided template by adding a “## WHY” section that outlines the problem being solved, key changes made, and any relevant implementation details. |
| Title Check | ❓ Inconclusive | The pull request title “16762 components scrapecreators” is generic and does not clearly convey the primary change implemented; it includes the issue number and a vague reference to “components scrapecreators” without describing what functionality has been added or modified. | Please update the title to a concise, descriptive phrase that summarizes the key changes, for example “Add scrapecreators integration actions and sources.” |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Out of Scope Changes Check | ✅ Passed | All added files (actions for scrapecreators, common constants and utils, the application integration module, base polling source, and the new-profile-update source) directly support the scrapecreators integration and align with the linked issue’s scope. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8857995 and fcd8375.

📒 Files selected for processing (1)
  • components/scrapecreators/actions/search-creators/search-creators.mjs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • components/scrapecreators/actions/search-creators/search-creators.mjs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Lint Code Base
  • GitHub Check: Verify TypeScript components
  • GitHub Check: Publish TypeScript components
  • GitHub Check: pnpm publish

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (10)
components/scrapecreators/sources/new-profile-update/test-event.mjs (1)

48-49: Consider normalizing collection fields

tags is a comma-separated string. Prefer an array (e.g., ["Channel", "Channel Name", "Channel Description"]) to avoid downstream splitting and i18n issues.
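
As a rough sketch of that normalization, assuming the upstream payload keeps sending a comma-separated string, a small helper could convert it before the event is emitted. The name `normalizeTags` is hypothetical and does not appear in the PR.

```javascript
// Hypothetical helper: turn "a, b, c" into ["a", "b", "c"]; arrays pass through.
function normalizeTags(tags) {
  if (Array.isArray(tags)) return tags;
  if (typeof tags !== "string") return [];
  return tags
    .split(",")
    .map((tag) => tag.trim())
    .filter(Boolean); // drop empty entries such as trailing commas
}
```

Note the limitation: a tag that itself contains a comma cannot be recovered this way, which is one more reason to prefer an array at the source.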

components/scrapecreators/common/utils.mjs (1)

1-46: Harden deep-diff: handle arrays/non‑plain objects and avoid false negatives

Current recursion treats any object as nested. Non‑plain objects like Date, RegExp, Map, Set, and arrays can produce incorrect diffs (e.g., two different Dates appear unchanged). Also, there’s no cycle protection.

Refactor to:

  • Recurse only into plain objects.
  • Compare arrays element‑wise (or treat as modified if length/ordering differences are unacceptable).
  • Treat non‑plain objects as value types (strict equality or .valueOf() for Date).
  • Add a visited set to prevent infinite recursion.

Patch sketch:

-export function getObjectDiff(obj1, obj2) {
-  const diff = {};
+export function getObjectDiff(obj1, obj2, _seen = new WeakSet()) {
+  const diff = {};
+  const isPlainObject = (v) => v && typeof v === "object" && v.constructor === Object;
+  const isArray = Array.isArray;
+  const isNonPlainObject = (v) =>
+    v && typeof v === "object" && !isPlainObject(v) && !isArray(v);
+
+  // Cycle guard
+  if (obj1 && typeof obj1 === "object") {
+    if (_seen.has(obj1)) return diff;
+    _seen.add(obj1);
+  }

   // Check for differences in obj1's properties
-  for (const key in obj1) {
-    if (Object.prototype.hasOwnProperty.call(obj1, key)) {
+  for (const key of Object.keys(obj1 ?? {})) {
+    if (Object.prototype.hasOwnProperty.call(obj1, key)) {
       if (!Object.prototype.hasOwnProperty.call(obj2, key)) {
         diff[key] = {
           oldValue: obj1[key],
           newValue: undefined,
           status: "deleted",
         };
-      } else if (typeof obj1[key] === "object" && obj1[key] !== null &&
-                 typeof obj2[key] === "object" && obj2[key] !== null) {
-        const nestedDiff = getObjectDiff(obj1[key], obj2[key]);
+      } else if (isArray(obj1[key]) && isArray(obj2[key])) {
+        // Array diff: element-wise
+        const a1 = obj1[key], a2 = obj2[key];
+        const max = Math.max(a1.length, a2.length);
+        const arrDiff = {};
+        for (let i = 0; i < max; i++) {
+          if (i in a1 && !(i in a2)) {
+            arrDiff[i] = { oldValue: a1[i], newValue: undefined, status: "deleted" };
+          } else if (!(i in a1) && i in a2) {
+            arrDiff[i] = { oldValue: undefined, newValue: a2[i], status: "added" };
+          } else if (i in a1 && i in a2) {
+            if (isPlainObject(a1[i]) && isPlainObject(a2[i])) {
+              const nd = getObjectDiff(a1[i], a2[i], _seen);
+              if (Object.keys(nd).length) arrDiff[i] = { status: "modified", changes: nd };
+            } else if (a1[i] !== a2[i]) {
+              arrDiff[i] = { oldValue: a1[i], newValue: a2[i], status: "modified" };
+            }
+          }
+        }
+        if (Object.keys(arrDiff).length) {
+          diff[key] = { status: "modified", changes: arrDiff };
+        }
+      } else if (isPlainObject(obj1[key]) && isPlainObject(obj2[key])) {
+        const nestedDiff = getObjectDiff(obj1[key], obj2[key], _seen);
         if (Object.keys(nestedDiff).length > 0) {
           diff[key] = {
             status: "modified",
             changes: nestedDiff,
           };
         }
+      } else if (isNonPlainObject(obj1[key]) || isNonPlainObject(obj2[key])) {
+        // Compare non-plain objects as values (e.g., Date.valueOf())
+        const v1 = obj1[key] instanceof Date ? obj1[key].valueOf() : obj1[key];
+        const v2 = obj2[key] instanceof Date ? obj2[key].valueOf() : obj2[key];
+        if (v1 !== v2) {
+          diff[key] = { oldValue: obj1[key], newValue: obj2[key], status: "modified" };
+        }
       } else if (obj1[key] !== obj2[key]) {
         diff[key] = {
           oldValue: obj1[key],
           newValue: obj2[key],
           status: "modified",
         };
       }
     }
   }

   // Check for properties added in obj2
-  for (const key in obj2) {
-    if (Object.prototype.hasOwnProperty.call(obj2, key)) {
+  for (const key of Object.keys(obj2 ?? {})) {
+    if (Object.prototype.hasOwnProperty.call(obj2, key)) {
       if (!Object.prototype.hasOwnProperty.call(obj1, key)) {
         diff[key] = {
           oldValue: undefined,
           newValue: obj2[key],
           status: "added",
         };
       }
     }
   }

   return diff;
 }

Please confirm if profile snapshots can contain Dates, Maps/Sets, or specialized objects. If strictly JSON (primitives/arrays/plain objects), we can simplify the array handling and skip non‑plain types.
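
If the snapshots do turn out to be strictly JSON, the simplified variant might look like the sketch below. `diffJson` is a hypothetical name, and comparing arrays as whole values via JSON.stringify is an assumption made for brevity (it is sensitive to element and key order). No cycle guard is included because JSON data cannot be cyclic.

```javascript
const isPlainObject = (v) =>
  v !== null && typeof v === "object" && !Array.isArray(v);

// Sketch: deep diff for JSON-like values (primitives, arrays, plain objects).
function diffJson(prev, next) {
  const before = prev ?? {};
  const after = next ?? {};
  const diff = {};
  const keys = new Set([...Object.keys(before), ...Object.keys(after)]);

  for (const key of keys) {
    const inBefore = Object.prototype.hasOwnProperty.call(before, key);
    const inAfter = Object.prototype.hasOwnProperty.call(after, key);
    const a = before[key];
    const b = after[key];

    if (!inAfter) {
      diff[key] = { oldValue: a, newValue: undefined, status: "deleted" };
    } else if (!inBefore) {
      diff[key] = { oldValue: undefined, newValue: b, status: "added" };
    } else if (isPlainObject(a) && isPlainObject(b)) {
      // Recurse only into plain objects and keep the nested shape.
      const changes = diffJson(a, b);
      if (Object.keys(changes).length > 0) {
        diff[key] = { status: "modified", changes };
      }
    } else if (JSON.stringify(a) !== JSON.stringify(b)) {
      // Primitives and arrays are compared as whole values.
      diff[key] = { oldValue: a, newValue: b, status: "modified" };
    }
  }
  return diff;
}
```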

components/scrapecreators/common/constants.mjs (1)

1-43: Stabilize and future‑proof platform lists

  • Dedupe PLATFORMS to avoid duplicates as groups evolve.
  • Freeze exports to prevent accidental mutation at runtime.
  • Optional: rename "user/boards" to userBoards to avoid slash‑key edge cases.

Example:

-export const PATH_PLATFORMS = {
+export const PATH_PLATFORMS = Object.freeze({
-  "user/boards": [
+  // Consider: userBoards
+  "user/boards": [
     "pinterest",
   ],
   ...
-};
+});
 
-export const URL_PLATFORMS = [
+export const URL_PLATFORMS = Object.freeze([
   "linkedin",
   "facebook",
   ...PATH_PLATFORMS["empty"],
-];
+]);
 
-export const SEARCH_PLATFORMS = [
+export const SEARCH_PLATFORMS = Object.freeze([
   "tiktok",
   "threads",
-];
+]);
 
-export const HANDLE_PLATFORMS = [
+export const HANDLE_PLATFORMS = Object.freeze([
   "instagram",
   "twitter",
   "truthsocial",
   "bluesky",
   "twitch",
   "snapchat",
   ...SEARCH_PLATFORMS,
   ...PATH_PLATFORMS["user/boards"],
   ...PATH_PLATFORMS["channel"],
-];
+]);
 
-export const PLATFORMS = [
-  ...URL_PLATFORMS,
-  ...HANDLE_PLATFORMS,
-];
+export const PLATFORMS = Object.freeze(
+  Array.from(new Set([
+    ...URL_PLATFORMS,
+    ...HANDLE_PLATFORMS,
+  ]))
+);
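
One caveat worth keeping in mind with the example above: Object.freeze is shallow, so freezing PATH_PLATFORMS alone leaves its nested arrays mutable. A minimal sketch of a recursive freeze follows. `deepFreeze` is a hypothetical helper, it assumes acyclic data, and the "channel" entry is placeholder data rather than the PR's actual list.

```javascript
// Hypothetical helper: freeze nested objects and arrays first, then the container.
function deepFreeze(value) {
  if (value !== null && typeof value === "object") {
    Object.values(value).forEach(deepFreeze);
    Object.freeze(value);
  }
  return value;
}

// Shallow freeze: the outer object is locked, the inner array is not.
const shallow = Object.freeze({ "user/boards": ["pinterest"] });
shallow["user/boards"].push("oops"); // succeeds, the array was never frozen

// Deep freeze: the same push would throw a TypeError.
const deep = deepFreeze({
  "user/boards": ["pinterest"],
  channel: ["youtube"], // placeholder value, assumed for illustration
});
```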
components/scrapecreators/sources/common/base.mjs (1)

41-47: Use Date.now() and emit string ids

Minor polish:

  • Date.parse(new Date()) round-trips through a string and drops the milliseconds; use Date.now().
  • Normalize event id to string to avoid numeric id collisions across sources.
-        this.$emit(item, {
-          id: item[fieldId],
-          summary: this.getSummary(item),
-          ts: Date.parse(new Date()),
-        });
+        this.$emit(item, {
+          id: String(item[fieldId]),
+          summary: this.getSummary(item),
+          ts: Date.now(),
+        });
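
A short sketch of why the two timestamp forms differ, plus the suggested metadata shape. `buildMeta` is a hypothetical helper that only mirrors the diff above; it is not a function from the PR.

```javascript
// Date.parse(new Date(ms)) converts the Date to its default string form first.
// That string has no milliseconds, so the parsed value is truncated to seconds.
const withMillis = 1700000000123;
const roundTripped = Date.parse(new Date(withMillis));

// Hypothetical helper mirroring the suggested $emit metadata.
function buildMeta(item, fieldId, getSummary) {
  return {
    id: String(item[fieldId]), // always a string, even for numeric ids
    summary: getSummary(item),
    ts: Date.now(), // millisecond precision, no string round trip
  };
}
```

Here `roundTripped` ends up 123 ms earlier than `withMillis`, so Date.now() is slightly more precise as well as shorter.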
components/scrapecreators/actions/search-creators/search-creators.mjs (2)

12-18: Avoid diverging platform options here (potential mismatch with app-level PLATFORMS).

This action overrides the app’s platform propDefinition options with SEARCH_PLATFORMS (currently ["tiktok","threads"]). That may exclude platforms users expect (e.g., YouTube/Instagram per issue), or fall out of sync with app-level PLATFORMS.

Option A: Remove the options override and rely on app’s PLATFORMS.

   platform: {
     propDefinition: [
       app,
       "platform",
     ],
-    options: SEARCH_PLATFORMS,
   },

Option B: If search truly supports a subset, update SEARCH_PLATFORMS in common/constants.mjs to the intended, documented set and add a short note in the action’s description clarifying supported platforms. Based on PR objectives.


47-48: More helpful summary (include platform and count).

Small UX win: show platform and number of results.

-    $.export("$summary", `Successfully searched for **${this.query}**`);
+    $.export("$summary", `Found ${data.length} creator(s) on ${this.platform} for "${this.query}"`);
components/scrapecreators/actions/fetch-creator-profile/fetch-creator-profile.mjs (1)

25-35: Tighten the summary.

Include platform to aid debugging across multi-platform workflows.

-    const summary = `Successfully fetched creator profile for **${this.profileId}**`;
+    const summary = `Fetched ${this.platform} profile: ${this.profileId}`;
components/scrapecreators/sources/new-profile-update/new-profile-update.mjs (1)

59-62: Use Date.now() directly for timestamps.

Date.parse(new Date()) stringifies the Date before parsing it, which is noisier and truncates the timestamp to whole seconds. Prefer Date.now().

-          ts: Date.parse(new Date()),
+          ts: Date.now(),
components/scrapecreators/scrapecreators.app.mjs (2)

71-84: Simplify parsePath predicate (return boolean).

The current .find() predicate returns the key (a truthy string) or null instead of a boolean. It works, but it is non-idiomatic; return a boolean for clarity.

-    parsePath(platform) {
-      const path = Object.entries(PATH_PLATFORMS).find(([
-        key,
-        value,
-      ]) => value.includes(platform)
-        ? key
-        : null);
-
-      return path
-        ? path[0] === "empty"
-          ? ""
-          : `/${path[0]}`
-        : "/profile";
-    },
+    parsePath(platform) {
+      const entry = Object.entries(PATH_PLATFORMS)
+        .find(([, value]) => value.includes(platform));
+      return entry
+        ? (entry[0] === "empty" ? "" : `/${entry[0]}`)
+        : "/profile";
+    },
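
To show the expected mapping outside the app module, here is a standalone sketch of the same lookup. The PATH_PLATFORMS contents are placeholders: only the "user/boards" to "pinterest" pairing appears in this review, and the other entries are assumed for illustration.

```javascript
// Placeholder grouping, not the PR's actual constants.
const PATH_PLATFORMS = {
  "user/boards": ["pinterest"],
  channel: ["youtube"], // assumed
  empty: ["linktree"], // assumed
};

// Resolve the URL path segment for a platform, defaulting to "/profile".
function parsePath(platform) {
  const entry = Object.entries(PATH_PLATFORMS)
    .find(([, platforms]) => platforms.includes(platform));
  if (!entry) return "/profile";
  const [key] = entry;
  return key === "empty" ? "" : `/${key}`;
}
```

With these placeholder groups, a platform listed under "empty" resolves to an empty path and any unlisted platform falls back to "/profile".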

86-113: Do not mutate the caller’s params in paginate; add a default for users.

Avoid side effects across iterations/callers and guard against missing arrays.

-    async *paginate({
-      fn, params = {}, platform, maxResults = null, ...opts
-    }) {
-      let hasMore = false;
-      let count = 0;
-      let newCursor;
-
-      do {
-        params.cursor = newCursor;
-        const {
-          cursor,
-          users,
-        } = await fn({
-          platform,
-          params,
-          ...opts,
-        });
+    async *paginate({ fn, params = {}, platform, maxResults = null, ...opts }) {
+      let hasMore = false;
+      let count = 0;
+      let newCursor;
+      const baseParams = { ...params };
+
+      do {
+        const reqParams = { ...baseParams, cursor: newCursor };
+        const { cursor, users = [] } = await fn({
+          platform,
+          params: reqParams,
+          ...opts,
+        });
         for (const d of users) {
           yield d;
 
           if (maxResults && ++count === maxResults) {
             return count;
           }
         }
 
         newCursor = cursor;
         hasMore = users.length;
 
       } while (hasMore);
     },
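
A self-contained sketch of the non-mutating pattern, runnable against a mock fetcher, is shown below. It is not the app's method: the `platform` option is folded into `opts` for brevity, and stopping when the cursor is missing is an added assumption (the original loop stops only on an empty page).

```javascript
// Cursor-based async generator that never writes to the caller's params.
async function* paginate({ fn, params = {}, maxResults = null, ...opts }) {
  let cursor;
  let count = 0;

  while (true) {
    // Build a fresh params object for every request.
    const page = await fn({ params: { ...params, cursor }, ...opts });
    const users = page.users ?? [];

    for (const user of users) {
      yield user;
      if (maxResults && ++count >= maxResults) return;
    }

    cursor = page.cursor;
    // Assumption: an empty page or a missing cursor means no more data.
    if (users.length === 0 || !cursor) return;
  }
}

// Usage sketch: collect every yielded item into an array.
async function collect(iterable) {
  const items = [];
  for await (const item of iterable) items.push(item);
  return items;
}
```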
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0b8ba5c and 8857995.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (9)
  • components/scrapecreators/actions/fetch-creator-profile/fetch-creator-profile.mjs (1 hunks)
  • components/scrapecreators/actions/search-creators/search-creators.mjs (1 hunks)
  • components/scrapecreators/common/constants.mjs (1 hunks)
  • components/scrapecreators/common/utils.mjs (1 hunks)
  • components/scrapecreators/package.json (2 hunks)
  • components/scrapecreators/scrapecreators.app.mjs (1 hunks)
  • components/scrapecreators/sources/common/base.mjs (1 hunks)
  • components/scrapecreators/sources/new-profile-update/new-profile-update.mjs (1 hunks)
  • components/scrapecreators/sources/new-profile-update/test-event.mjs (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-12-12T19:23:09.039Z
Learnt from: jcortes
PR: PipedreamHQ/pipedream#14935
File: components/sailpoint/package.json:15-18
Timestamp: 2024-12-12T19:23:09.039Z
Learning: When developing Pipedream components, do not add built-in Node.js modules like `fs` to `package.json` dependencies, as they are native modules provided by the Node.js runtime.

Applied to files:

  • components/scrapecreators/package.json
🧬 Code graph analysis (6)
components/scrapecreators/common/utils.mjs (1)
components/scrapecreators/sources/new-profile-update/new-profile-update.mjs (1)
  • diff (51-51)
components/scrapecreators/actions/search-creators/search-creators.mjs (1)
components/scrapecreators/common/constants.mjs (2)
  • SEARCH_PLATFORMS (22-25)
  • SEARCH_PLATFORMS (22-25)
components/scrapecreators/scrapecreators.app.mjs (1)
components/scrapecreators/common/constants.mjs (6)
  • PLATFORMS (39-42)
  • PLATFORMS (39-42)
  • URL_PLATFORMS (16-20)
  • URL_PLATFORMS (16-20)
  • PATH_PLATFORMS (1-14)
  • PATH_PLATFORMS (1-14)
components/scrapecreators/actions/fetch-creator-profile/fetch-creator-profile.mjs (1)
components/scrapecreators/sources/common/base.mjs (1)
  • response (26-26)
components/scrapecreators/sources/new-profile-update/new-profile-update.mjs (1)
components/scrapecreators/common/utils.mjs (2)
  • diff (2-2)
  • getObjectDiff (1-46)
components/scrapecreators/sources/common/base.mjs (1)
components/scrapecreators/scrapecreators.app.mjs (1)
  • fn (94-101)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Ensure component commits modify component versions
  • GitHub Check: Verify TypeScript components
  • GitHub Check: Publish TypeScript components
  • GitHub Check: pnpm publish
  • GitHub Check: Lint Code Base
🔇 Additional comments (3)
components/scrapecreators/package.json (1)

15-17: @pipedream/platform version up-to-date

  • The dependency version ^3.1.0 matches the latest release, so no update is needed.
  • Optional: add an "engines": { "node": ">=18" } field to align with the Pipedream runtime.
components/scrapecreators/sources/common/base.mjs (1)

26-27: Verify and handle all possible fn() return shapes in base.mjs

In components/scrapecreators/sources/common/base.mjs (lines 25–27), I couldn’t locate any getFunction() implementations—please confirm what shape(s) fn() returns and update emitEvent to handle { value }, { users }, plain arrays, etc., with a clear fallback and an Array.isArray(response) check before iterating.
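
One possible shape for that fallback is sketched below, assuming the three response forms named in the comment are the only ones. `toItems` is a hypothetical helper and is not part of the PR.

```javascript
// Hypothetical normalizer: always hand emitEvent an array to iterate over.
function toItems(response) {
  if (Array.isArray(response)) return response;
  if (Array.isArray(response?.value)) return response.value;
  if (Array.isArray(response?.users)) return response.users;
  return []; // fallback: nothing iterable, so nothing is emitted
}
```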

components/scrapecreators/scrapecreators.app.mjs (1)

26-30: Optional: include Accept: application/json header. The component correctly exposes this.$auth.api_key, so no rename is needed.

…ors.mjs

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@vunguyenhung vunguyenhung merged commit 8fb1a6b into master Oct 1, 2025
10 checks passed
@vunguyenhung vunguyenhung deleted the 16762-components-scrapecreators branch October 1, 2025 02:11
Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Components] scrapecreators

4 participants